Genome-wide associated variants of subclinical atherosclerosis among young people with HIV and gene-environment interactions

Background: Genome-wide association studies (GWAS) have identified some variants associated with subclinical atherosclerosis (SCA) in the general population, but these lack sufficient validation. Beyond traditional risk factors, whether and how genetic variants are associated with SCA among people with HIV (PWH) remains to be elucidated.
Method: A large original GWAS and gene-environment interaction analysis of SCA was conducted among Chinese PWH (n = 2850) and age/sex-matched HIV-negative controls (n = 5410). Subgroup analyses by age and functional annotation of variants were also performed.
Results: In contrast to HIV-negative counterparts, the host genome had a greater impact on young PWH than on older PWH: one genome-wide significant variant (rs77741796, P = 2.20 × 10⁻⁹) and eight suggestively significant variants (P < 1 × 10⁻⁶) were specifically associated with SCA among PWH younger than 45 years. Seven genomic loci and 15 genes were mapped as potentially playing a role in SCA among young PWH; these were enriched in the biological process of atrial cardiac muscle cell membrane repolarization and the molecular function of protein kinase A subunit binding. Furthermore, genome-wide interaction analyses revealed significant HIV-gene interactions overall as well as gene-environment interactions with alcohol consumption, tobacco use and obesity among PWH. The identified gene-environment interactions on SCA among PWH might be useful for identifying high-risk individuals for the prevention of SCA, particularly among those with tobacco use and alcohol consumption.
Conclusion: The present study provides new clues to the genetic contribution to SCA among young PWH and is a starting point for precision interventions targeting HIV-related atherosclerosis.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12967-022-03817-6.


Background
Cardiovascular diseases (CVDs) have been identified as a major cause of death among people with HIV (PWH) in the antiretroviral therapy (ART) era [1]. Most forms of CVD originate from atherosclerosis, a chronic inflammatory disease of the blood vessels seen mainly in the elderly population [2]. Of note, there is an increase in incidental atherosclerosis among PWH [3], and HIV infection appears to increase the risk of carotid plaque [4,5]. Atherosclerosis starts early in life and progresses silently [2], and thus it is necessary to identify atherosclerosis from its early subclinical stages. We previously observed a disproportionately higher risk and earlier onset of subclinical atherosclerosis (SCA) among young PWH than among HIV-negative counterparts in the Comparative HIV and Aging Research in Taizhou (CHART) cohort [6]. This age-specific association between HIV and SCA is independent of traditional risk factors for CVDs and suggests unrecognized, unique mechanisms linking HIV infection with SCA [6].
Atherosclerosis is a complex disease involving multiple factors such as smoking, alcohol use and genetics [7]. It has been reported that genetics plays a vital role in atherosclerosis development [8], accounting for 30-50% of the variance in SCA [9]. Genome-wide association studies (GWAS) and meta-analyses have identified a number of genetic variants that contribute to the risk of SCA in the general population [10-12]. However, whether and how genetic variants are differentially associated with SCA among PWH remains to be elucidated, especially in Asian populations. The contributing effect of HIV infection could involve different sets of genes and biological pathways in SCA development [9]. A GWAS conducted in 2010 reported two SNPs (rs2229116 and rs7177922) in tight linkage disequilibrium (LD) in the RYR3 gene that were associated with SCA in 171 White HIV-infected men; this remains the only GWAS of SCA among PWH [13].
Therefore, in the present study, we conducted a large GWAS of SCA among Chinese PWH and HIV-negative counterparts based on the CHART cohort in an attempt to compare the differences of genetic associations with SCA between these two groups. The possible underlying mechanism of earlier onset of SCA among PWH as previously revealed [6] was explored by age-specific stratified analyses. Furthermore, genome-wide gene-environment interaction analyses of SCA that incorporate HIV infection, alcohol consumption, tobacco use and obesity were also performed.

Study design and participants
Participants were enrolled from the CHART cohort, which is an ongoing prospective cohort study specifically designed to facilitate epidemiological and pathophysiological understandings of aging-related comorbidities among Chinese PWH and comparative HIV-negative individuals [14]. The present cross-sectional study was based on the baseline survey of CHART conducted in 2017-2020. Details about the CHART cohort have been described elsewhere [6].
As of January 2020, a total of 8260 participants, including 2850 PWH and 5410 HIV-negative individuals, had been enrolled. Of these, 7904 (95.7%) had no missing cIMT data, passed genotyping quality control and were included in the analyses. Written informed consent was obtained from all study participants. The study was approved by the Institutional Review Board of Fudan University School of Public Health, Shanghai, China.

Data collection and measurements Questionnaire interview and physical examination
A standardized structured questionnaire was administered face-to-face by trained health staff to collect information on age, sex, tobacco and regular alcohol use, physical activity and history of non-communicable diseases (NCDs). Regular alcohol use was defined as alcohol use at least 3 times per week. Smoking status was classified as "never", "previous" or "current", with current smoking defined as having smoked at least one cigarette in the past 30 days. Physical examinations covering waist circumference, hip circumference, height, weight and blood pressure (BP) were carried out. Body mass index (BMI) was calculated and general obesity was defined as BMI ≥ 24 kg/m². The waist-to-hip ratio (WHR) cutoff for abdominal obesity was 0.90 for men and 0.85 for women [14].
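The anthropometric definitions above can be written down as a small sketch. The function names are hypothetical, and treating the WHR cutoff as inclusive (≥) is an assumption, since the text gives only the cutoff values.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def general_obesity(weight_kg: float, height_m: float) -> bool:
    """General obesity as defined in the text: BMI >= 24 kg/m^2."""
    return bmi(weight_kg, height_m) >= 24.0


def abdominal_obesity(waist_cm: float, hip_cm: float, sex: str) -> bool:
    """Abdominal obesity by waist-to-hip ratio (WHR).

    Cutoffs from the text: 0.90 for men, 0.85 for women.
    Assumption: the cutoff itself counts as obese (>=).
    """
    cutoffs = {"male": 0.90, "female": 0.85}
    return waist_cm / hip_cm >= cutoffs[sex]
```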
Hypertension was defined as systolic BP ≥ 140 mmHg or diastolic BP ≥ 90 mmHg, or prior clinical diagnosis of hypertension [15]. Diabetes was defined as HbA1c ≥ 6.5% or a prior clinical diagnosis. Metabolic syndrome (MS) was defined according to standardized protocol [16]. Dyslipidemia was defined as TC ≥ 6.2 mmol/L, LDL ≥ 4.1 mmol/L, HDL < 1.0 mmol/L or TG ≥ 2.3 mmol/L [17]. HIV-related variables were extracted from the national HIV/AIDS Comprehensive Response Information Management System (CRIMS) [18]. Nadir CD4 count was defined as the lowest CD4 count as recorded.
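As an illustration only, the comorbidity definitions above translate into simple threshold checks. The function and parameter names are hypothetical; lipid values are assumed to be in mmol/L and HbA1c in percent, as in the text.

```python
def has_hypertension(sbp_mmhg: float, dbp_mmhg: float,
                     prior_diagnosis: bool = False) -> bool:
    """Systolic BP >= 140 mmHg, diastolic BP >= 90 mmHg, or prior diagnosis."""
    return sbp_mmhg >= 140 or dbp_mmhg >= 90 or prior_diagnosis


def has_diabetes(hba1c_percent: float, prior_diagnosis: bool = False) -> bool:
    """HbA1c >= 6.5% or prior clinical diagnosis."""
    return hba1c_percent >= 6.5 or prior_diagnosis


def has_dyslipidemia(tc: float, ldl: float, hdl: float, tg: float) -> bool:
    """Any one of: TC >= 6.2, LDL >= 4.1, HDL < 1.0, TG >= 2.3 (mmol/L)."""
    return tc >= 6.2 or ldl >= 4.1 or hdl < 1.0 or tg >= 2.3
```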

SCA measurements and outcome definition
One reliable and valid measure of SCA is carotid intima-media thickness (cIMT) [6,19]. Intima-media thickness (IMT) of the left common carotid artery was measured by trained sonographers using a high-resolution B-mode ultrasound imager (LOGIQ P5 pro, GE, Indianapolis, USA), in accordance with standard procedures. Briefly, an IMT image was obtained over about 10 mm of the longitudinal carotid length that was free of plaque and showed an identifiable double-line pattern.
Subclinical carotid atherosclerosis was defined as a cIMT of 780 μm or more, according to our previously published study [6]. The average cIMT values were also categorized into < 780, 780-1000 and > 1000 μm [20]. We assigned two SCA-related phenotypes: one quantitative, using continuous cIMT values, and the other categorical, termed "binary-cIMT", with a cutoff of 780 μm.
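A minimal sketch of how the two phenotypes and the three-level categorization could be derived from a cIMT value in micrometres. The function names and category labels are hypothetical.

```python
SCA_CUTOFF_UM = 780  # cutoff for subclinical carotid atherosclerosis, from the text


def binary_cimt(cimt_um: float) -> int:
    """Binary-cIMT phenotype: 1 if cIMT >= 780 um (SCA), else 0."""
    return 1 if cimt_um >= SCA_CUTOFF_UM else 0


def cimt_category(cimt_um: float) -> str:
    """Three-level categorization: < 780, 780-1000 and > 1000 um."""
    if cimt_um < SCA_CUTOFF_UM:
        return "<780"
    if cimt_um <= 1000:
        return "780-1000"
    return ">1000"
```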

Genotyping and quality control
Genomic DNA was extracted from whole peripheral blood samples using a commercial DNA extraction kit (Qiagen) and quantified using PicoGreen reagent (Invitrogen). We genotyped study samples for 664,165 SNPs on the Infinium™ Chinese Genotyping Array-24 v1.0 BeadChip. We then performed quality control using PLINK 1.9 [21] at the sample level and the SNP level, excluding data according to the following criteria: (1) sample level: call rate < 95%, gender discrepancies, heterozygosity rate outliers (> 6 SD), and unexpected duplicates; (2) SNP level: missing data > 5%, minor allele frequency (MAF) < 0.05, and deviation from Hardy-Weinberg equilibrium (HWE) (P < 10⁻⁶). Principal component analysis (PCA) was done in PLINK 1.9 on the remaining 372,728 SNPs, and the first five principal components (PCs) were extracted and used in further association analyses.
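The SNP-level filters can be illustrated with a short, self-contained sketch. This is not how PLINK implements them: in particular, the HWE check below uses a one-degree-of-freedom chi-square goodness-of-fit test as a stand-in, and the function names and the genotype encoding (0/1/2 allele counts, `None` for a missing call) are assumptions made for the example.

```python
import math
from typing import Optional, Sequence


def hwe_chisq_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_snp_qc(genotypes: Sequence[Optional[int]],
                  max_missing: float = 0.05,
                  min_maf: float = 0.05,
                  hwe_p: float = 1e-6) -> bool:
    """Keep a SNP unless missing > 5%, MAF < 0.05 or HWE P < 1e-6."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    if 1 - len(called) / len(genotypes) > max_missing:
        return False
    counts = [called.count(k) for k in (0, 1, 2)]
    freq = (counts[1] + 2 * counts[2]) / (2 * len(called))
    if min(freq, 1 - freq) < min_maf:
        return False
    return hwe_chisq_p(*counts) >= hwe_p
```

With this encoding, a SNP whose genotype counts match Hardy-Weinberg proportions is kept, while one with only heterozygous calls is dropped by the HWE filter.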

Genome-wide association (GWA) analyses
GWA analyses were performed under the assumption of additive genetic effects. For continuous phenotypes, a linear mixed model (LMM) was applied; for dichotomous phenotypes, a generalized linear mixed model (GLMM) was used. LMM-based methods are usually preferred over linear regression-based methods, largely because they can account for population stratification [22,23] and relatedness without the need to remove related individuals [24]. The LMM was fitted using fastGWA, an extremely resource-efficient approach implemented in the GCTA software package [25,26]. The GLMM was fitted using fastGWA-GLMM, a resource-efficient tool for GLMM-based GWAS of binary traits in biobank-scale data such as the UK Biobank [24]. For all analyses, we adjusted for the following covariates: age (continuous variable), sex, regular alcohol use, current smoking status, BMI, and the first five PCs. We also created quantile-quantile (QQ) plots and Manhattan plots using the R package "CMplot". The QQ plot was used to evaluate the overall significance of the GWAS, and the deviation of the observed from the expected distribution of P values was summarized by the genomic inflation factor (λGC). We further performed age-stratified analyses in both PWH and HIV-negative counterparts. The genome-wide significance threshold was P < 5 × 10⁻⁸, and the suggestive significance threshold was P < 1 × 10⁻⁶ [27,28]. Plots of representative SNPs were generated using the LocusZoom online software [29].
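For readers unfamiliar with λGC, the sketch below shows one common way to compute it from a list of P values: convert each P value to a one-degree-of-freedom chi-square statistic and divide the median by the expected median under the null (about 0.455). The text does not say which implementation was used, so this is an assumption, as are the function names. The two significance thresholds are taken from the text.

```python
from statistics import NormalDist, median
from typing import Iterable

GENOME_WIDE_P = 5e-8   # genome-wide significance threshold (from the text)
SUGGESTIVE_P = 1e-6    # suggestive significance threshold (from the text)


def genomic_inflation_factor(p_values: Iterable[float]) -> float:
    """lambda_GC = median observed chi-square / median expected chi-square (1 df).

    P values must lie in (0, 1].
    """
    nd = NormalDist()
    # For a two-sided test, chi-square(1 df) = z^2 with z = Phi^-1(p / 2).
    chi2 = [nd.inv_cdf(p / 2) ** 2 for p in p_values]
    expected_median = nd.inv_cdf(0.25) ** 2  # about 0.455
    return median(chi2) / expected_median


def significance_label(p: float) -> str:
    """Classify a P value against the two thresholds used in the study."""
    if p < GENOME_WIDE_P:
        return "genome-wide"
    if p < SUGGESTIVE_P:
        return "suggestive"
    return "not significant"
```

A value of λGC close to 1 indicates that the bulk of the test statistics behaves as expected under the null, that is, little inflation from stratification or relatedness.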

Genome-wide interaction analyses
To test for interactions between environmental factors and genetic variants, we conducted a genome-wide interaction analysis by including a two-way interaction term, based on the equation:

$$Y = \beta_0 + \beta_1\,\mathrm{SNP} + \beta_2\,E + \beta_3\,(\mathrm{SNP} \times E) + \beta_4\,C + \varepsilon$$

Here, Y is the vector of observed cIMT measurements, SNP is the genotype, E is the environmental factor, C denotes the other covariates and ε is the residual error; β₀ is a constant, β₁ and β₂ are the main effects of the SNP and the environmental factor, respectively, β₄ is the main effect of the other covariates and β₃ is the coefficient of the interaction term to be tested. The environmental factors, each tested in turn, were HIV infection, alcohol consumption, tobacco use and obesity.
Age-specific interaction effects between HIV infection and genetic variants on SCA were estimated using the same equation, separately among participants under 45 years old and those aged 45 years or older.
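The interaction model can be illustrated with a plain ordinary least squares fit for a single SNP. This is a simplified sketch: the text does not specify the software or estimator used for the interaction scan, and the helper names `design_row` and `fit_ols` are hypothetical. The coefficient at index 3 of the returned list corresponds to the interaction term.

```python
from typing import List, Sequence


def design_row(snp: float, env: float, covariates: Sequence[float]) -> List[float]:
    """One row of the design matrix: [1, SNP, E, SNP*E, covariates...]."""
    return [1.0, float(snp), float(env), float(snp) * float(env),
            *(float(c) for c in covariates)]


def fit_ols(X: List[List[float]], y: Sequence[float]) -> List[float]:
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]
```

In practice the interaction coefficient would be tested with a standard error and P value, which this sketch omits for brevity.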

Statistical analyses
Comparisons of baseline characteristics, stratified by HIV serostatus, were performed using Student's t test and analysis of variance (ANOVA) for normally distributed continuous variables, Mann Whitney U test for continuous variables with skewed distributions, and chi-square test for categorical variables. Distribution of cIMT was also analyzed. Logistic regressions were conducted to examine the association of baseline characteristics and SCA.
We also calculated unweighted and weighted genetic risk scores (GRS) from selected risk variants (P < 1 × 10⁻⁶) for SCA. To calculate the GRS for the ith subject from the n selected risk variants, the following formula was used [30]:

$$\mathrm{GRS}_i = \sum_{j=1}^{n} w_j x_{ij}$$

Here x_ij is the number of risk alleles for the jth SNP in the ith subject (x_ij = 0, 1 or 2) and w_j is the weight or coefficient of the jth SNP. The unweighted GRS simply counts the number of SCA-associated alleles an individual carries across all potential risk variants, thus giving equal weight to all risk alleles (w_j = 1). The weighted GRS was calculated likewise, using the associated beta estimate as w_j for each selected SNP allele count. Weighting normally results in higher specificity of the GRS by assigning more weight to variants with stronger effects. A P value less than 0.05 was considered statistically significant. Data were analyzed with SAS 9.4 software (SAS Institute, Cary, NC, USA).
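The unweighted and weighted scores described above reduce to a weighted sum of risk-allele counts. A minimal sketch follows; the function name and the example weights are hypothetical, not estimates from the study.

```python
from typing import Optional, Sequence


def genetic_risk_score(allele_counts: Sequence[int],
                       weights: Optional[Sequence[float]] = None) -> float:
    """GRS for one subject: sum over SNPs of weight * risk-allele count.

    allele_counts: risk alleles carried at each selected SNP (0, 1 or 2).
    weights: per-SNP weights (e.g. beta estimates); None gives the
             unweighted score, i.e. every weight equals 1.
    """
    if any(x not in (0, 1, 2) for x in allele_counts):
        raise ValueError("allele counts must be 0, 1 or 2")
    if weights is None:
        weights = [1.0] * len(allele_counts)
    if len(weights) != len(allele_counts):
        raise ValueError("one weight is required per SNP")
    return float(sum(w * x for w, x in zip(weights, allele_counts)))


# Example with three SNPs and illustrative weights.
counts = [0, 1, 2]
unweighted = genetic_risk_score(counts)                   # 0 + 1 + 2
weighted = genetic_risk_score(counts, [0.5, 0.25, 0.5])   # 0 + 0.25 + 1.0
```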

Functional annotation, gene mapping and gene set analysis
Functional annotation was performed with Functional Mapping and Annotation (FUMA) [31], an online platform for the functional mapping of genetic variants. We first defined 'independent significant SNPs' as those surpassing a predefined P value threshold (1 × 10⁻⁶) and showing moderate to low LD with each other (r² < 0.6). We further defined 'lead SNPs' as the subset of independent significant SNPs that were independent of each other at r² < 0.1. In addition, we defined genomic risk loci by merging LD blocks of independent significant SNPs in close physical proximity (< 250 kb). SNPs in genomic risk loci were mapped to genes in FUMA using three strategies: positional mapping, expression quantitative trait loci (eQTL) mapping and chromatin interaction mapping. Genes implicated by the mapping of GWAS SNPs were further investigated using the GENE2FUNC procedure in FUMA, which tests enrichment of the list of mapped genes in MSigDB gene sets, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and Gene Ontology (GO) terms. Details are presented in Additional files 2, 3, 4, 5, 6 and 7.
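The locus-merging rule can be sketched as a simple interval merge. This illustrates the idea only and is not FUMA's implementation; the function name, the tuple layout (chromosome, start, end in base pairs) and measuring the gap from the end of one block to the start of the next are all assumptions.

```python
from typing import List, Tuple

Block = Tuple[str, int, int]  # (chromosome, start_bp, end_bp)


def merge_ld_blocks(blocks: List[Block], max_gap_bp: int = 250_000) -> List[Block]:
    """Merge LD blocks on the same chromosome that lie < 250 kb apart."""
    merged: List[Block] = []
    for chrom, start, end in sorted(blocks):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] < max_gap_bp:
            prev = merged[-1]
            # Extend the current locus to cover the new block.
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged
```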

Demographic characteristics and risk factors of SCA
The final analyses included 2583 PWH and 5321 HIV-negative individuals. Of them, 74.2% were male and 52.4% (4139/7904) were younger than 45 years. The cIMT phenotype followed an approximately normal distribution. Demographic characteristics of participants by HIV serostatus and SCA are summarized in Table 1. PWH had a higher prevalence of SCA than HIV-negative counterparts across the different categorical groups. PWH who were older or had general/abdominal obesity, regular alcohol use, current/previous smoking status, hypertension, diabetes or MS had a higher prevalence of SCA (all P < 0.05).

Genetic variants associated with SCA
A total of 7904 participants and 372,728 SNPs were subject to final association analyses (Fig. 1). The association analyses with SCA were conducted for all participants, PWH and HIV-negative individuals, respectively. Manhattan plots and QQ plots were shown in Additional file 1: Figs. S1-6.
For binary cIMT, no variant reached the potential significance level among all participants, PWH and HIV negative participants (Additional file 1: Table S3).

Age-specific genetic variants associated with SCA
We further conducted stratified genetic association analyses among PWH and HIV-negative counterparts under or above 45 years old.
For binary cIMT, no variant reached the potential significance level among PWH and HIV negative participants under or above 45 years old (Additional file 1: Table S3).
Among participants above 45 years old, no interaction term reached the suggestive significance level, and the most significant interaction effect was observed for rs11948504 (β = 0.19, P interact = 7.16 × 10⁻⁶) (Table 3).

Association of GRS with SCA
Genetic variants potentially associated with cIMT among PWH under 45 years old were selected to calculate the unweighted and weighted GRS, and associations with cIMT and binary-cIMT were tested by GLM and logistic regression models, respectively. Both univariable and multivariable regression models were fitted, the latter adjusting for age, sex, regular alcohol use, current smoking status and BMI.

Functional annotation, gene mapping and gene set analysis
Using three gene mapping strategies in FUMA, we identified 7 genomic risk loci and 15 mapped genes associated with cIMT among PWH under 45 years old (Additional file 1: Tables S6, 7), 6 genomic risk loci and 11 mapped genes among HIV-negative counterparts under 45 years old (Additional file 1: Tables S8, 9).
Among PWH under 45 years old, positional gene mapping aligned SNPs to 5 genes by genomic location, eQTL gene mapping matched SNPs to 8 genes by the expression levels they influence, and chromatin interaction mapping annotated SNPs to 4 genes on the basis of 3D DNA-DNA interactions (Additional file 1: Tables S7, S10-11). Of note, the variant rs2507941 was also mapped to the SCN5A gene through chromatin interaction mapping (Additional file 1: Table S7). Eleven genes were notable as they were linked via
Gene-set based analysis was performed to further evaluate the underlying disease mechanisms responsible for the genetic signals. Two significant GO biological processes and six significant GO molecular functions were identified among PWH under 45 years old (Table 5). Among those gene sets, two GO gene sets are involved in the pathogenesis of SCA: regulation of atrial cardiac muscle cell membrane repolarization (FDR = 0.034) and the molecular function of protein kinase A (PKA) subunit binding (FDR = 0.018). Gene-set results among HIV-negative individuals under 45 years old are also shown in Table 5.

Discussion
This study for the first time comprehensively compared and evaluated the genome-wide associated variants and gene-environment interaction in relation to SCA among PWH and HIV-negative individuals in Chinese population, indicating that the host genome had a greater impact on SCA among young PWH than the elder PWH. Nine novel genetic variants, seven genomic loci and 15 mapped genes were identified to be associated with SCA among PWH under 45 years old. Genetic variants had a significant interaction with HIV infection, tobacco use, alcohol use and obesity on the development of SCA. Aggregations of the identified genetic variants were highly associated with SCA among young PWH, as predicted by GRS. Using gene-set analyses, we demonstrated that genetic variants of SCA among PWH under 45 years old pointed towards a role of genes enriched in the biological process of cardiac muscle cell repolarization and molecular function of PKA subunit binding.
We previously reported that SCA could occur early in young HIV-infected adults in the CHART cohort [6]. Based on the same cohort, we found in the present study that one significant variant, rs77741796 near the PTPRQ gene, and eight suggestively significant variants at the KCNQ1/FER/PJA2/ITGA9/EPHA6 genes were associated with SCA among PWH under 45 years old (Table 2). There was no significant variant associated with SCA among PWH above 45 years old, but significant variants were found among HIV-negative individuals both under and above 45 years old. These results indicate that genetic predisposition may play a crucial role in the development of SCA among young HIV-infected adults rather than older PWH. Elderly PWH usually have a higher prevalence of multimorbidity and traditional risk factors for CVDs, such as hypertension and metabolic syndrome [6,32], and thus the role of genetics may be overshadowed. In contrast, among young PWH with fewer traditional risk factors and accordingly a lower prevalence of CVDs, the role of genetics may become prominent in the onset and progression of atherosclerosis. To what extent and how genetic variants affect the development of SCA among PWH remains to be addressed in longitudinal prospective cohort studies.
Table 5 The significant gene-set analyses for cIMT among participants under 45 years old
In genome-wide interaction analyses among participants under 45 years old, we identified four genetic variants at RBFOX1/CBLN1/ITGA9 that had a suggestively significant interaction with HIV infection, indicating an age-specific interaction effect between HIV infection and genetic variants on the development of SCA. rs2507941 at ITGA9 was associated with cIMT both in the GWA analysis among young PWH and in the genome-wide interaction analyses. The protein that ITGA9 encodes can enhance cell migration and regulate various cellular biological functions [33]. It has also been reported that human ITGA9 is associated with blood pressure and linked to cardiovascular phenotypes [34]. The protein that RBFOX1 encodes is a muscle-specific isoform of an RNA splicing regulator, and a previous study found that regulation of RNA splicing by RBFOX1 plays a crucial role in transcriptome reprogramming during heart failure [35]. CBLN1 encodes a cerebellum-specific precursor protein that establishes parallel fiber-Purkinje cell synapses [36], but its role in SCA development is reported here for the first time.
For risk variants for cIMT in young PWH, a positive interaction effect with HIV infection was also identified among all young participants. This might be partly due to a mixture of accelerated aging caused by HIV infection and host genomic effects in young PWH, who had fewer traditional risk factors for SCA. In addition, risk variants for cIMT in HIV-negative individuals had a negative interaction with HIV infection among all participants. The significant variants related to SCA among PWH and HIV-negative counterparts under 45 years old were also different. The underlying mechanism might be attributable to the integration of HIV proviral DNA into the host genome, which could affect the expression of host genes and influence basal and inducible transcription [37,38], and thus produce differential associations of genetic variants with SCA between comparable PWH and HIV-negative counterparts.
Genome-wide interaction analyses with traditional risk factors for SCA were also performed among PWH. These analyses identified variants at NKAIN2/RAD23B/ADCY8/CDH26/SLC2A7/CDH12/NUDT12 that had a genome-wide significant interaction with alcohol consumption or tobacco use (Table 4). Previous studies have also shown a strong association of NKAIN2 with alcohol dependence [39] and nicotine dependence [40]. The protein that RAD23B encodes has been shown to elevate the nucleotide excision activity of 3-methyladenine-DNA glycosylase and plays a role in base excision repair through DNA damage recognition [41]; such damage is commonly caused by tobacco use [42]. It has also been reported that overexpression of a neuronal ADCY8 in the sinoatrial node markedly affects heart rate and rhythm [43]. Cadherins (CDHs) form adherens junctions and are known stabilizers of atherosclerotic plaques [44]. Overexpression of CDH12 and CDH26 might be related to myocardial infarction and the progression of atherosclerosis [44]. SLC2A7 encodes a protein that catalyzes the uptake of sugars through facilitated diffusion [45], while NUDT12 regulates the concentrations of individual nucleotides [46], but their links to SCA are reported for the first time in our study. Potential risk variants for cIMT in young PWH also interacted with alcohol consumption, tobacco use and obesity. These results strongly highlight the importance of controlling traditional risk factors for SCA, such as reducing alcohol use, stopping smoking and maintaining a healthy weight, among PWH carrying high-risk alleles in order to reduce SCA risk.
Using functional annotation of the associated genetic variants, we found that variants at KCNQ1 and SCN5A were associated with SCA among PWH under 45 years old. The KCNQ1 gene encodes a voltage-gated potassium channel required for the repolarization phase of the cardiac action potential [47]. A cohort study in Japan reported that SNPs at KCNQ1 were significantly associated with coronary epicardial endothelial dysfunction [48]. An animal experiment has confirmed that an imprinted antisense lncRNA in the KCNQ1 gene promotes macrophage lipid accumulation and accelerates the development of atherosclerosis through the miR-452-3p/HDAC3/ABCA1 pathway [8]. The protein encoded by SCN5A is found primarily in cardiac muscle, and defects in this gene have been associated with atrial fibrillation (AF) and cardiomyopathy [49]. A previous study also indicated that variants at SCN5A were related to increased AF risk and PR interval [50], but its relation to SCA is reported here for the first time.
Moreover, three SNPs (rs6762348, rs62263680 and rs62262941) located at EPHA6 on chromosome 3 were associated with SCA among PWH under 45 years old. The EPHA6 gene is predicted to enable transmembrane-ephrin receptor activity and has been found to be associated with insulin signaling [51] and blood pressure phenotypes [52], which are known risk factors for atherosclerosis. We also found that genetic variants at the ITGA9, FER, PJA2 and PTPRQ genes were significantly associated with SCA among PWH under 45 years old. Variants near PTPRQ reached genome-wide significance for cIMT among young PWH; this gene encodes a member of the type III receptor-like protein-tyrosine phosphatase family, which plays roles in cellular proliferation and differentiation [53] and might have a link to cardiovascular disease [54]. FER regulates cell-cell adhesion, and absence of FER protein tyrosine kinase can induce epithelial barrier dysfunction [55], which is regarded as a hallmark of many human panvascular diseases, including atherosclerosis, hypertension and diabetes [56]. One study demonstrated the association of PJA2 with atherosclerosis through protein-protein interaction network analysis [57]. The unweighted and weighted GRSs were significantly associated with SCA among PWH under 45 years old and might serve as a predictive biomarker panel for SCA among young HIV-infected adults.
The gene-set analyses revealed that genes related to SCA among PWH under 45 years old were enriched in regulation of atrial cardiac muscle cell membrane repolarization and the molecular function of protein kinase A (PKA) catalytic subunit binding. KCNQ1 and SCN5A participate in the regulation of atrial cardiac muscle cell membrane repolarization, a process that modulates the establishment or extent of a membrane potential in the polarizing direction towards the resting potential in an atrial cardiomyocyte [58]. Dysregulation of atrial cardiac muscle cell membrane repolarization is related to long QT syndrome, sudden cardiac death, cardiac death and death from any cause [59-61]. KCNQ1 and PJA2 are involved in catalytic subunit binding of PKA, which is one of the master regulatory molecules in the heart. It has been reported that persistent activation of PKA signaling is linked to pathological hypertrophy and progression to heart failure [62].
To our knowledge, this is the largest GWAS of SCA among comparable PWH and HIV-negative counterparts in Asia, and the first to measure the genome-wide interaction effects of environmental factors and genetic variants on SCA. Nevertheless, our study has several limitations. First, no replication study was conducted, which may reduce the robustness of our results to some extent. However, the stringent P-value thresholds should reduce the false discovery rate, and candidate SNPs are presented for future validation. Second, since all genetic data came from one cohort and were obtained using a single chip, no imputation of SNP genotypes was performed. Results of imputation analyses will be reported in future work. Last, the sample size for PWH under 45 years old was relatively small, although genome-wide significant variants were still identified. Future studies with larger sample sizes are needed to validate these results.

Conclusion
In summary, the present GWAS indicated a greater impact of host genome on SCA among young Chinese PWH, as well as the interaction effects between genetic variants and environmental factors on HIV-related SCA development. Nine genetic variants, seven genomic loci and 15 mapped genes were identified to be associated with SCA among PWH under 45 years old. Pathways related to biological processes of atrial cardiac muscle cell membrane repolarization and molecular function of PKA subunit binding were implicated in pathogenesis of SCA in HIV-infected youngsters. Furthermore, the identified gene-environment interaction on SCA among PWH might be useful for discovering high-risk individuals for the prevention of SCA, particularly among those with tobacco use and alcohol consumption. The current study provides new clues for the causal mechanism of SCA among young Chinese HIV-infected adults, and is the starting point of precision intervention targeting HIV-related atherosclerosis.